Improving Paragraph2Vec

Author

  • Seokho Hong
Abstract

Paragraph vectors were proposed as a powerful unsupervised method for learning representations of text of arbitrary length. Although paragraph vectors are versatile, unsupervised, and unconstrained by text length, the concept has not been developed further since its first publication. We propose two extensions to the initial formulation of the paragraph vector and test their performance on two separate semantic tasks. Although the results are limited by the fact that our attempt to reproduce the original paragraph vectors was not successful, we can still show that the extended models outperform the original paragraph vectors.

Similar resources

Masters Thesis: Exploiting Embedding in Content-Based Recommender Systems

XING is a leading career-oriented social networking site in Europe that recommends job ads to its users. One of the most widely used methods in recommender systems is content-based filtering, which analyzes the description of item characteristics and the user profile that captures the user's preferences. Due to the sparsity of its dataset, i.e. many job postings are rarely interacted with, ...

Interpretable probabilistic embeddings: bridging the gap between topic models and neural networks

We consider probabilistic topic models and more recent word embedding techniques from the perspective of learning hidden semantic representations. Inspired by a striking similarity between the two approaches, we merge them and learn probabilistic embeddings with an online EM algorithm on word co-occurrence data. The resulting embeddings perform on par with Skip-Gram Negative Sampling (SGNS) on word simil...

Deep Learning Approaches to the Multi-Instance Multi-Label (MIML) Learning Problem

The multi-instance multi-label (MIML) learning problem is as follows: given a bag of instances and a set of labels, the task is to assign to the bag the labels that are relevant to it as a whole. MIML is relevant to relation extraction, vision, machine learning and information extraction. In this work, our aim is to develop a scalable deep learning approach towards the ...

Identifying and Prioritizing Strategies for Improving Financing Systems of Iran's Oil and Gas Industry

The oil and gas industry has huge financial turnover and major projects, especially in the upstream areas, require substantial financing. Hence, securing financing is one of the most important requirements for successful implementation of projects in this industry. In this research, we adopt a descriptive approach and rely on the opinion of experts, to identify and prioritize strategies for imp...

The role of integrated urban management in improving crisis management and improving the quality of public services to citizens (Case study: Tehran province)

The purpose of this study was to investigate the role of integrated urban management in improving crisis management and improving the quality of public services to citizens. In order to conduct this research, integrated urban management theory, urban crisis management theory, public services theory and the relationship of integrated urban management theory with public services theory have been ...

Publication date: 2015